Abstract

This paper addresses regularization by sparsity constraints by
means of weighted $\ell^p$ penalties for $0 \le p \le 2$. For $1 \le p \le 2$ special
attention is paid to convergence rates in norm and to source conditions.
As main results it is proven that one gets a convergence rate of $\sqrt{\delta}$ in the
2-norm for $1 < p \le 2$ and in the 1-norm for $p = 1$ as soon as the unknown
solution is sparse. The case $p = 1$ needs a special technique where not
only Bregman distances but also a so-called Bregman-Taylor distance has
to be employed.

For $p < 1$ only preliminary results are shown. These results indicate
that, different from $p \ge 1$, the regularizing properties depend on the
interplay of the operator and the basis of sparsity. A counterexample for
$p = 0$ shows that regularization need not happen.

AMS Subject classification: Primary 47A52; Secondary 65J20, 65F22. 



1 Introduction 



In this paper we discuss the regularizing properties of so-called sparsity
constraints. We consider linear inverse problems with a bounded operator
$A : X \to Y$ between two Hilbert spaces. Our setting is classical [12]: We assume
that we are given noisy data $g^\delta \in Y$ such that there exists $g^+ = Af^+$ with
$\|g^+ - g^\delta\|_Y \le \delta$. Our aim is to reconstruct $f^+$ from the noisy data $g^\delta$. It
is well known that this problem is ill-posed if and only if the range of $A$ is
non-closed [12].

Recently regularization with sparsity constraints has become popular due to
the influential paper [9]. In this setting one assumes that the unknown solution
has a sparse representation in a certain orthonormal basis or frame $(\psi_k)$ of $X$,
i.e. the unknown solution $f^+$ can be expressed as $f^+ = \sum_k u_k \psi_k$ where the sum
consists of a few (and especially finitely many) terms only. This knowledge is
used to set up a so-called sparsity constraint for Tikhonov regularization, i.e. the
regularized solution is given as a minimizer of

$$\|Af - g^\delta\|_Y^2 + \alpha \sum_k w_k\, \phi(|(f|\psi_k)|)$$

with a suitably chosen function $\phi$. The parameter $\alpha > 0$ is a regularization
parameter and the weighting sequence $w_k > 0$ allows one to regularize each
coefficient individually. For the weighting sequence we assume that it is bounded
away from zero: $w_k \ge w_0 > 0$. Several choices of $\phi$ are possible. In [9] it is argued
that the choice $\phi(s) = s^p$ for $1 \le p \le 2$ promotes sparsity of the minimizer.
A heuristic explanation is that these functions give a higher weight to small
coefficients and a lower weight to large coefficients. Of course the cases $p < 1$ or
even $p = 0$ will produce sparse minimizers, but in this case the convexity of the
functional is lost and minimizers need not exist (see [16] for a discussion of
the case $A = I$).

For notational convenience we introduce the synthesis operator $B : \ell^2 \to X$
defined by $Bu = \sum_k u_k \psi_k$. We define $K = AB$ and rewrite the Tikhonov
functional as

$$\Psi(u) = \|Ku - g^\delta\|_Y^2 + \alpha \sum_k w_k |u_k|^p. \tag{1}$$

The calculation of a minimizer of the above functional is not a straightforward
task. Convergent algorithms in the infinite dimensional setting for $1 \le p \le 2$
were proposed and analyzed in [4-6,9,10,14]. Generalizations to joint sparsity
[13], nonlinear operators [2, 17, 18] and the case $p = 0$ [1] have been proposed.
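To make the minimization of (1) concrete for $p = 1$, the following is a minimal sketch of the iterative soft-thresholding iteration in the spirit of [9], written for a small matrix. It is an illustration under stated assumptions, not the algorithm of any particular reference: the operator is a finite matrix `K` (list of rows) with spectral norm below 1, and the names `soft_threshold` and `ista` are chosen here.

```python
import math


def soft_threshold(x, t):
    """Shrink x towards zero by the amount t (soft thresholding)."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def ista(K, g, alpha, w, n_iter=500):
    """Iterate u <- S(u + K^T (g - K u)) with thresholds alpha * w_k / 2.

    Sketch for minimizing ||K u - g||^2 + alpha * sum_k w_k |u_k|.
    Assumption: K is a list of rows and its spectral norm is below 1.
    """
    m, n = len(K), len(K[0])
    u = [0.0] * n
    for _ in range(n_iter):
        # residual r = g - K u
        r = [g[i] - sum(K[i][j] * u[j] for j in range(n)) for i in range(m)]
        # Landweber step v = u + K^T r
        v = [u[j] + sum(K[i][j] * r[i] for i in range(m)) for j in range(n)]
        # componentwise soft thresholding with threshold alpha * w_k / 2
        u = [soft_threshold(v[j], alpha * w[j] / 2.0) for j in range(n)]
    return u


# Diagonal toy problem: the second coefficient is thresholded to zero.
u = ista([[0.8, 0.0], [0.0, 0.5]], [0.8, 0.01], 0.1, [1.0, 1.0])
```

For this diagonal example the minimizer can be computed by hand coordinate by coordinate, which gives a simple way to check the iteration.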

In this paper we are going to discuss the regularizing properties of sparsity
constraints. First results on this topic can be found in [9] where convergence of
the minimizers in $X$ (resp. $\ell^2$) for vanishing noise and the parameter choice $\alpha(\delta)$
such that $\alpha \to 0$ and $\delta^2/\alpha \to 0$ has been shown. Moreover, it is shown that,
in the special case of wavelet bases with a special class of weights which lead
to Besov spaces, convergence rates can be achieved. The paper [18] also deals
with convergence of the minimizers and the proofs there show that convergence
in the stronger $\ell^1$-norm holds. Sparsity constraints can also be discussed in the
framework of regularization in Banach spaces like, e.g., in [7,8,15,19,20]. In
these papers convergence rates for general convex regularization are given in
terms of Bregman distances. In this paper we focus on convergence rates for
sparsity constraints in norm, i.e. in the norm in $X$ resp. $\ell^2$ or the $\ell^1$-norm.

The paper is organized as follows. Section 2 presents auxiliary results and in
Section 3 results on convergence rates for Tikhonov regularization with (1) for
$1 < p \le 2$ are presented; in particular we illustrate the role of the source condition.
Section 4 treats the case $p = 1$ which is considerably different and requires a
different technique. Section 5 collects preliminary results on the
regularization with $p < 1$. Here, no convergence rates can be given so far, and
are not to be expected in general. In the last section we draw conclusions.

Notation. We denote with $\ell^p_w$ the weighted $\ell^p$ space, i.e. the sequences $u$
such that $\sum_k w_k |u_k|^p$ converges. We consider the spaces for $0 < p < \infty$
which are normed spaces (quasi-normed for $p < 1$) when equipped with the
(quasi-)norm $\|u\|_{p,w} = (\sum_k w_k |u_k|^p)^{1/p}$. By $\ell^0$ we denote the set
$\{u : \mathbb{N} \to \mathbb{R} : u_k \ne 0 \text{ for finitely many } k\}$ of finitely supported or sparse
sequences and with $\ell^0_w$ the set $\{u : \mathbb{N} \to \mathbb{R} : \sum_k w_k\,\mathrm{sgn}(|u_k|) < \infty\}$. For
simplicity we write $\|u\| = \|u\|_2$ and the inner product of $u, v \in \ell^2$ is denoted
by $(u|v)$. Moreover, we will frequently use component-wise application of
operators to sequences, e.g. $(|u|^p)_k = |u_k|^p$ or $(wu)_k = w_k u_k$. With

$$\mathrm{Sgn}(x) = \begin{cases} \{1\} & \text{for } x > 0 \\ [-1, 1] & \text{for } x = 0 \\ \{-1\} & \text{for } x < 0 \end{cases}$$

we denote the multivalued sign while $\mathrm{sgn}$ stands for the usual sign with
$\mathrm{sgn}(0) = 0$. For an operator $A : X \to Y$ between two Hilbert spaces the Hilbert
space adjoint is denoted by $A^* : Y \to X$.



2 Preliminary results 

In this section we collect preliminary results which are needed in the following.

As a first result we report that the cases $1 \le p \le 2$ indeed promote sparsity
and that $p = 1$ leads to finitely supported minimizers.

Lemma 2.1. Let $1 \le p \le 2$. A minimizer $u^*$ of $\Psi$ from (1) fulfills
$u^* \in \ell^{2(p-1)}_{w^2}$.

Proof. Every minimizer $u$ of $\Psi$ fulfills

$$-2K^*(Ku - g^\delta) \in \alpha w p\,\mathrm{Sgn}(u)|u|^{p-1}. \tag{2}$$

For $p > 1$ the inclusion becomes an equation and since the left hand side is an
$\ell^2$ sequence, the right hand side is also in $\ell^2$. It follows that

$$\sum_k w_k^2 |u_k|^{2(p-1)} < \infty.$$

For $p = 1$ assume that $u \notin \ell^0_{w^2}$, i.e. the sum $\sum_k w_k^2\,\mathrm{sgn}(|u_k|)$ diverges. Hence,
every choice of a sign in (2) also leads to a diverging sum and it follows that
the left hand side in (2) cannot be an $\ell^2$ sequence, which is a contradiction. □

The next statement is on convergence of minimizers of (1) for $\delta \to 0$.

Theorem 2.2 ([9]). Assume that either $p > 1$ or $K$ is injective, $w_k \ge w_0 > 0$,
and let $u^{\alpha,\delta}$ be a minimizer of $\Psi$ from (1). If the parameter choice $\alpha(\delta)$ fulfills

$$\lim_{\delta \to 0} \alpha(\delta) = 0, \qquad \lim_{\delta \to 0} \frac{\delta^2}{\alpha(\delta)} = 0$$

then it holds

$$\lim_{\delta \to 0} \|u^{\alpha,\delta} - u^+\| = 0.$$

This says that the method is indeed a regularization. To get a statement
on the rate of convergence the true solution $u^+$ has to fulfill some source
condition. This will be the topic of Sections 3 and 4.

Next we state a basic inequality which we will need in the following. 

Lemma 2.3 ([4]). Let $1 < p \le 2$. For $C > 0$ and $L > 0$ it holds for every
$s, t \in \mathbb{R}$ with $|s| \le C$ and $|t - s| \le L$

$$|t|^p - |s|^p \ge p\,\mathrm{sgn}(s)|s|^{p-1}(t - s) + \kappa |t - s|^2$$

with $\kappa = \frac{p(p-1)}{2(C+L)^{2-p}}$.



3 Regularization with 1 < p ≤ 2

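In this section we analyze the "easiest" case $1 < p \le 2$. The main result goes
as follows.

Theorem 3.1. Let $1 < p \le 2$, $w_k \ge w_0 > 0$ and let $u^{\alpha,\delta}$ be a minimizer of $\Psi$
given in (1). Furthermore let $u^+$ fulfill the source condition

$$\exists\, \theta \in Y : \quad w\,\mathrm{sgn}(u^+)|u^+|^{p-1} = K^*\theta. \tag{3}$$

Then for the choice $\alpha \sim \delta$ it holds

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y = O(\delta) \quad \text{for } \delta \to 0, \tag{4}$$
$$\sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 = O(\delta) \quad \text{for } \delta \to 0. \tag{5}$$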

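Proof. Due to the minimizing property we have

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + \alpha \sum_k w_k |u^{\alpha,\delta}_k|^p \le \delta^2 + \alpha \sum_k w_k |u^+_k|^p$$

which gives

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + \alpha \sum_k w_k \bigl(|u^{\alpha,\delta}_k|^p - |u^+_k|^p\bigr) \le \delta^2.$$

Since $|u^+_k|$ and $|u^{\alpha,\delta}_k - u^+_k|$ can be bounded uniformly in $k$ (the second due to
Theorem 2.2) we can apply Lemma 2.3 which yields

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + \alpha\kappa \sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 + p\alpha \sum_k w_k\,\mathrm{sgn}(u^+_k)|u^+_k|^{p-1}(u^{\alpha,\delta}_k - u^+_k) \le \delta^2.$$

Rearranging gives

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + \alpha\kappa \sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 \le \delta^2 + \alpha \bigl(p\,w\,\mathrm{sgn}(u^+)|u^+|^{p-1} \,\big|\, u^+ - u^{\alpha,\delta}\bigr).$$

Applying the source condition (3) and the Cauchy-Schwarz inequality leads to

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + \alpha\kappa \sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 \le \delta^2 + \alpha p \|\theta\|_Y \|K(u^+ - u^{\alpha,\delta})\|_Y.$$

Adding and subtracting $g^\delta$ in the last norm and denoting $\rho = \|\theta\|_Y\, p/2$ leads to

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + \alpha\kappa \sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 \le \delta^2 + 2\alpha\rho\delta + 2\alpha\rho \|Ku^{\alpha,\delta} - g^\delta\|_Y.$$

Rearranging and completing the squares gives

$$\bigl(\|Ku^{\alpha,\delta} - g^\delta\|_Y - \alpha\rho\bigr)^2 + \alpha\kappa \sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 \le (\delta + \alpha\rho)^2.$$

This finally implies

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y \le \delta + 2\alpha\rho \tag{6}$$

and

$$\sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 \le \frac{(\delta + \alpha\rho)^2}{\alpha\kappa}. \tag{7}$$

The assertion follows with $\alpha \sim \delta$. □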



Since $w_k \ge w_0$ we can deduce the following corollary immediately.

Corollary 3.2. Under the assumptions of Theorem 3.1 it holds

$$\|u^{\alpha,\delta} - u^+\| = O(\sqrt{\delta}) \quad \text{for } \delta \to 0.$$

We state a few remarks to illustrate Theorem 3.1.

Remark 3.3 (Constants in the O-notation). For the choice $\alpha = \delta$ one deduces
from (6) that

$$\|Ku^{\alpha,\delta} - g^\delta\|_Y \le (1 + 2\rho)\delta$$

and hence, the constant in the O-notation only depends on $\rho$. From the
estimate (7) we have

$$\sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 \le \frac{(1 + \rho)^2}{\kappa}\,\delta.$$

In this case, the constant depends also on $\kappa$ from Lemma 2.3 for which it holds

$$\frac{1}{\kappa} = \frac{2(C + L)^{2-p}}{p(p - 1)}$$

where $C$ is an upper bound on $|u^+_k|$ and $L$ is an upper bound for $|u^{\alpha,\delta}_k - u^+_k|$.
The value $L$ tends to zero for $\delta \to 0$ and $C$ depends on $u^+$ only and hence, $C$
and $L$ are uniformly bounded for $\delta \to 0$. Finally, we see that the constant
mainly depends on $p$: it is large for small $p$ and tends to infinity for
$p \to 1$. To summarize, we may say that the regularization with a weighted
$\ell^p$-norm leads to a convergence rate of order $\sqrt{\delta}$ in the 2-norm but the associated
constant gets arbitrarily large for $p$ close to one. Hence, one may not expect a
similar theorem to hold for the limiting case $p = 1$. Fortunately, Theorem 4.3
below shows that this pessimism is unfounded.
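The behaviour of the constant can be checked numerically. The sketch below evaluates $\kappa = p(p-1)/(2(C+L)^{2-p})$ and tests the inequality of Lemma 2.3 on a grid; the function names and the grid are illustrative choices made here, and a grid test is of course only a sanity check, not a proof.

```python
def kappa(p, C, L):
    """Constant kappa = p (p - 1) / (2 (C + L)^(2 - p)) from Lemma 2.3."""
    return p * (p - 1.0) / (2.0 * (C + L) ** (2.0 - p))


def sign(x):
    """Usual sign with sign(0) = 0."""
    return (x > 0) - (x < 0)


def basic_inequality_gap(p, s, t, C, L):
    """Left side minus right side of the inequality in Lemma 2.3."""
    lhs = abs(t) ** p - abs(s) ** p
    rhs = (p * sign(s) * abs(s) ** (p - 1.0) * (t - s)
           + kappa(p, C, L) * (t - s) ** 2)
    return lhs - rhs


def min_gap_on_grid(p, C=1.0, L=1.0, n=40):
    """Smallest gap over a grid of points with |s| <= C and |t - s| <= L."""
    gaps = []
    for i in range(-n, n + 1):
        s = C * i / n
        for j in range(-n, n + 1):
            t = s + L * j / n
            gaps.append(basic_inequality_gap(p, s, t, C, L))
    return min(gaps)


# 1 / kappa grows without bound as p approaches 1 (here C = L = 1).
inverse_kappas = [1.0 / kappa(p, 1.0, 1.0) for p in (2.0, 1.5, 1.1, 1.01)]
```

The list `inverse_kappas` is increasing, which mirrors the blow-up of the constant in (7) for $p \to 1$.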

Remark 3.4 (The results of Burger and Osher [7]). In the case of a general
convex and lower-semicontinuous penalty functional $J$, Burger and Osher proved
that the source condition

$$\exists\, \theta : \quad K^*\theta \in \partial J(u^+)$$

leads to a convergence rate

$$D_\xi(u^{\alpha,\delta}, u^+) = O(\delta)$$

for $u^{\alpha,\delta}$ minimizers of

$$\|Ku - g^\delta\|_Y^2 + \alpha J(u).$$

Here $\partial J$ denotes the subgradient of $J$, $\xi \in \partial J(u^+)$ and

$$D_\xi(u^{\alpha,\delta}, u^+) = J(u^{\alpha,\delta}) - J(u^+) - (\xi \,|\, u^{\alpha,\delta} - u^+)$$

is the Bregman distance, see also [15]. One can also deduce Theorem 3.1 from
this result by noting that this source condition is precisely the one in Theorem 3.1
and that for $1 < p \le 2$ the Bregman distance of $J(u) = \sum_k w_k |u_k|^p$ can be
bounded from below, with $\kappa$ as in Lemma 2.3:

$$\kappa \sum_k w_k |u^{\alpha,\delta}_k - u^+_k|^2 \le D_\xi(u^{\alpha,\delta}, u^+)$$

for $\|u^{\alpha,\delta} - u^+\| \le M$ (which follows from Lemma 2.3 or the inequalities of Xu
and Roach, see [21-23]).

Remark 3.5 (Source conditions in terms of $\ell^p$-spaces). In the classical (quadratic)
theory the source condition can usually be interpreted as some kind of smoothness
condition. When working in sequence space, we see that the source condition
says something about the decay of the solution $u^+$. We assume that the operator
under consideration has the property $\operatorname{range} K^* = \ell^q_v$ where we assume that the
space $\ell^q_v$ is contained in $\ell^2$. Hence, the dual space $(\ell^q_v)' = \ell^{q'}_{v'}$ with dual exponent
$q' = q/(q - 1)$ and dual weight $v' = v^{-1/(q-1)}$ is larger than $\ell^2$. One may say that the
operator $K : \ell^{q'}_{v'} \to Y$ has a "smoothing" (or better "damping") property. Now,
the source condition reads as $w\,\mathrm{sgn}(u^+)|u^+|^{p-1} \in \operatorname{range} K^* = \ell^q_v$ and hence

$$\sum_k v_k w_k^q |u^+_k|^{q(p-1)} < \infty \quad \text{or equivalently} \quad u^+ \in \ell^{q(p-1)}_{v w^q}.$$

4 Regularization with p = 1 

We now turn to the case $p = 1$. In this case previous results give convergence
rates in the Bregman distance only [7,15,19,20]. Moreover, Remark 3.4 does
not apply, since the function $J(u) = \sum_k w_k |u_k|$ is not strictly convex and hence,
the Bregman distance with respect to $J$ cannot
be estimated by the $\ell^2$-norm in general. It holds $\partial J(u) = (w_k\,\mathrm{Sgn}(u_k))_k$. One
sees that the Bregman distance fulfills

$$D_\xi(u, u^+) \le 2 \sum_{(u_k > 0 \,\wedge\, u^+_k < 0) \,\vee\, (u_k < 0 \,\wedge\, u^+_k > 0)} w_k |u_k|.$$

Consequently, the Bregman distance is zero as soon as the signs of $u$ and $u^+$
coincide and a convergence rate regarding the Bregman distance does not give
satisfactory information, see also [8].

To prove a convergence rate like in Theorem 3.1 we need the following lemma
which can be found in similar form in [5]. As an important ingredient we need
the so-called FBI property, also from [5].

Definition 4.1. An operator $K : \ell^2 \to Y$ mapping into a Hilbert space has the
finite basis injectivity (FBI) property, if for all finite subsets $I \subset \mathbb{N}$ the operator
$K|_I$ is injective, i.e. for all $u, v \in \ell^2$ with $Ku = Kv$ and $u_k = v_k = 0$ for all
$k \notin I$ it follows $u = v$.

The lemma gives an estimate which compares the Bregman distance with
the $\ell^1$-norm.

Lemma 4.2. Let $u^+$ have finite support, $w_k \ge w_0 > 0$, let $K$ fulfill the FBI
property, and define

$$T(u) = \|K(u - u^+)\|_Y^2, \tag{8}$$
$$R(u) = \sum_k w_k |u_k| - \sum_k w_k |u^+_k| - \sum_k w_k\,\mathrm{sgn}(u^+_k)(u_k - u^+_k). \tag{9}$$

Then there exists $\lambda > 0$ such that

$$R(u) + T(u) \ge \lambda \|u - u^+\|_1^2$$

whenever $\|u - u^+\|_1 \le M$.
Proof. We define $I = \{k \mid \mathrm{sgn}(u^+_k) = \pm 1\}$ which is a finite set. We estimate

$$R(u) = \sum_k w_k |u_k| - w_k |u^+_k| - w_k\,\mathrm{sgn}(u^+_k)(u_k - u^+_k)$$
$$= \sum_k w_k |u_k| - w_k\,\mathrm{sgn}(u^+_k) u_k$$
$$\ge \sum_{k \notin I} w_k |u_k| - w_k\,\mathrm{sgn}(u^+_k) u_k = \sum_{k \notin I} w_k |u_k|.$$

Denoting with $I^c$ the complement of $I$ and with $P_{I^c}$ the projection onto the
subspace where all coefficients in $I$ are zero we get (using $u^+_k = 0$ for $k \in I^c$)

$$R(u) \ge w_0 \|P_{I^c}(u - u^+)\|_1.$$

Since $\|P_{I^c}(u - u^+)\|_1 \le M$ we can estimate

$$R(u) \ge \frac{w_0}{M} \|P_{I^c}(u - u^+)\|_1^2. \tag{10}$$

To establish an estimate for the remaining part $P_I u$ we start with $u =
P_I u + P_{I^c} u$ and use the inequalities of Cauchy-Schwarz (in the form $-(u|v) \le
\|u\| \|v\|$) and Young ($ab \le \frac{a^2}{4} + b^2$ for $a, b \ge 0$) to get

$$\|Ku\|_Y^2 = \|KP_I u\|_Y^2 + 2 (KP_I u \,|\, KP_{I^c} u) + \|KP_{I^c} u\|_Y^2$$
$$\ge \frac{\|KP_I u\|_Y^2}{2} - \|KP_{I^c} u\|_Y^2$$
$$\ge \frac{\|KP_I u\|_Y^2}{2} - \|K\|^2 \|P_{I^c} u\|^2. \tag{11}$$

Since $I$ is finite and $K$ obeys the FBI property there is a constant $c > 0$ such
that

$$c \|P_I u\|^2 \le \|KP_I u\|_Y^2.$$

Moreover, again since $I$ is finite, we can estimate the 2-norm from below by the
1-norm which leads (with a possibly different constant $c > 0$) to

$$c \|P_I u\|_1^2 \le \|KP_I u\|_Y^2.$$

Combining this with (11) gives

$$\|P_I u\|_1^2 \le \frac{2}{c} \bigl(\|Ku\|_Y^2 + \|K\|^2 \|P_{I^c} u\|^2\bigr).$$

Applying this estimate to $u - u^+$ instead of $u$, adding the inequality (10) and
using $\|u - u^+\|_1^2 \le 2\|P_I(u - u^+)\|_1^2 + 2\|P_{I^c}(u - u^+)\|_1^2$ leads to

$$\|u - u^+\|_1^2 \le \frac{4}{c} \bigl(T(u) + \|K\|^2 \|P_{I^c}(u - u^+)\|^2\bigr) + \frac{2M}{w_0} R(u).$$

By estimating the 1-norm from below by the 2-norm in (10) we get $\frac{M}{w_0} R(u) \ge
\|P_{I^c}(u - u^+)\|^2$ and hence,

$$\|u - u^+\|_1^2 \le \frac{4}{c}\, T(u) + \frac{2M}{w_0} \Bigl(\frac{2\|K\|^2}{c} + 1\Bigr) R(u)$$

which proves the claim. □
While the term $R$ from (9) is a Bregman distance, the term $T$ from (8) can
be seen as a Taylor distance: We define the functional $F(u) = \|Ku - g^\delta\|_Y^2$ and
observe that the term $T$ can be rewritten as

$$T(u) = F(u) - F(u^+) - (F'(u^+) \,|\, u - u^+).$$

Consequently, $T$ is the remainder of the Taylor expansion of the fidelity term
$F$. Therefore, Lemma 4.2 can be seen as an estimate on the Bregman-Taylor
distance $R + T$.
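A small numerical illustration of the two terms may help. The sketch below evaluates $R$ and $T$ for finite vectors; the function names and the toy data are chosen here for illustration only. It shows that $R$ vanishes when the signs of $u$ and $u^+$ agree although $u \ne u^+$, while $T$ still registers the difference.

```python
def sign(x):
    """Usual sign with sign(0) = 0."""
    return (x > 0) - (x < 0)


def bregman_R(u, u_true, w):
    """R(u): Bregman distance of the weighted l1 penalty at u_true."""
    return sum(wk * (abs(uk) - abs(vk) - sign(vk) * (uk - vk))
               for uk, vk, wk in zip(u, u_true, w))


def taylor_T(K, u, u_true):
    """T(u): squared norm of K (u - u_true), with K given as a list of rows."""
    d = [uk - vk for uk, vk in zip(u, u_true)]
    return sum(sum(row[j] * d[j] for j in range(len(d))) ** 2 for row in K)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u_true = [1.0, 0.0, -2.0]
w = [1.0, 1.0, 1.0]

same_signs = [3.0, 0.0, -0.5]      # signs agree with u_true
flipped = [-1.0, 0.0, -2.0]        # first entry has the opposite sign

r_same = bregman_R(same_signs, u_true, w)        # Bregman part vanishes
t_same = taylor_T(identity, same_signs, u_true)  # Taylor part does not
r_flip = bregman_R(flipped, u_true, w)
```

This is the reason why the sum $R + T$, and not $R$ alone, can control the $\ell^1$-distance.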

Lemma 4.2 enables us to prove the main result of this paper:

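Theorem 4.3. Let $u^+$ have finite support, $w_k \ge w_0 > 0$, $K$ obey the FBI
property, and let furthermore $u^+$ fulfill the source condition

$$\exists\, \theta \in Y : \quad w\,\mathrm{sgn}(u^+) = K^*\theta. \tag{12}$$

Then for every

$$u^{\alpha,\delta} \in \operatorname{argmin}_u\, \|Ku - g^\delta\|_Y^2 + \alpha \sum_k w_k |u_k|$$

and the parameter choice $\alpha = \delta$ it holds

$$\|u^{\alpha,\delta} - u^+\|_1 = O(\sqrt{\delta}).$$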

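Proof. Due to the minimizing property we have

$$0 \le \|Ku^+ - g^\delta\|_Y^2 + \alpha \sum_k w_k |u^+_k| - \|Ku^{\alpha,\delta} - g^\delta\|_Y^2 - \alpha \sum_k w_k |u^{\alpha,\delta}_k|$$
$$= \|Ku^+ - g^\delta\|_Y^2 - \|Ku^{\alpha,\delta} - g^\delta\|_Y^2$$
$$\quad + \alpha \Bigl(\sum_k w_k |u^+_k| - \sum_k w_k |u^{\alpha,\delta}_k| + \sum_k w_k\,\mathrm{sgn}(u^+_k)(u^{\alpha,\delta}_k - u^+_k)\Bigr)$$
$$\quad - \alpha \sum_k w_k\,\mathrm{sgn}(u^+_k)(u^{\alpha,\delta}_k - u^+_k).$$

Rearranging gives

$$\alpha R(u^{\alpha,\delta}) \le \delta^2 - \|Ku^{\alpha,\delta} - g^\delta\|_Y^2 - \alpha \sum_k w_k\,\mathrm{sgn}(u^+_k)(u^{\alpha,\delta}_k - u^+_k).$$

Since the convergence $u^{\alpha,\delta} \to u^+$ is known from Theorem 2.2 we can use
Lemma 4.2 to obtain

$$\alpha\lambda \|u^{\alpha,\delta} - u^+\|_1^2 - \alpha \|K(u^{\alpha,\delta} - u^+)\|_Y^2 \le \delta^2 - \|Ku^{\alpha,\delta} - g^\delta\|_Y^2 - \alpha \sum_k w_k\,\mathrm{sgn}(u^+_k)(u^{\alpha,\delta}_k - u^+_k).$$

With the source condition (12), the notation $\rho = \|\theta\|_Y/2$, and the Cauchy-
Schwarz inequality this gives

$$\alpha\lambda \|u^{\alpha,\delta} - u^+\|_1^2 - \alpha \|K(u^{\alpha,\delta} - u^+)\|_Y^2 \le \delta^2 - \|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + 2\alpha\rho \|K(u^{\alpha,\delta} - u^+)\|_Y.$$

Adding and subtracting $g^\delta$ in the last norm and rearranging leads to

$$\alpha\lambda \|u^{\alpha,\delta} - u^+\|_1^2 - \alpha \|K(u^{\alpha,\delta} - u^+)\|_Y^2 + \|Ku^{\alpha,\delta} - g^\delta\|_Y^2 - 2\alpha\rho \|Ku^{\alpha,\delta} - g^\delta\|_Y \le \delta^2 + 2\alpha\rho\delta.$$

Using

$$\|K(u^{\alpha,\delta} - u^+)\|_Y^2 \le \|Ku^{\alpha,\delta} - g^\delta\|_Y^2 + 2\delta \|Ku^{\alpha,\delta} - g^\delta\|_Y + \delta^2$$

leads to

$$\alpha\lambda \|u^{\alpha,\delta} - u^+\|_1^2 + (1 - \alpha) \|Ku^{\alpha,\delta} - g^\delta\|_Y^2 - 2\alpha(\rho + \delta) \|Ku^{\alpha,\delta} - g^\delta\|_Y \le (1 + \alpha)\delta^2 + 2\alpha\rho\delta.$$

Dividing by $(1 - \alpha)$ and completing the square on the left hand side gives

$$\frac{\alpha\lambda}{1 - \alpha} \|u^{\alpha,\delta} - u^+\|_1^2 + \Bigl(\|Ku^{\alpha,\delta} - g^\delta\|_Y - \frac{\alpha(\rho + \delta)}{1 - \alpha}\Bigr)^2 \le \frac{1 + \alpha}{1 - \alpha}\,\delta^2 + \frac{2\alpha}{1 - \alpha}\,\rho\delta + \Bigl(\frac{\alpha(\rho + \delta)}{1 - \alpha}\Bigr)^2.$$

Finally, this gives

$$\|u^{\alpha,\delta} - u^+\|_1^2 \le \frac{1}{\lambda} \Bigl(\frac{1 + \alpha}{\alpha}\,\delta^2 + 2\rho\delta + \frac{\alpha}{1 - \alpha}(\rho + \delta)^2\Bigr) = \frac{1}{\lambda\alpha(1 - \alpha)} (\delta + \alpha\rho)^2. \tag{13}$$

The choice $\alpha = \delta$ proves

$$\|u^{\alpha,\delta} - u^+\|_1 = O(\sqrt{\delta}) \quad \text{for } \delta \to 0.$$

□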
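The final bound of the proof can be written either in expanded form or compactly as $(\delta + \alpha\rho)^2 / (\lambda\alpha(1 - \alpha))$. As a quick sanity check of this elementary algebra, the sketch below compares both forms numerically with the common factor $1/\lambda$ dropped; the function names and sample values are illustrative assumptions.

```python
def bound_expanded(alpha, delta, rho):
    """(1 + a)/a * d^2 + 2 rho d + a/(1 - a) * (rho + d)^2."""
    return ((1.0 + alpha) / alpha * delta ** 2
            + 2.0 * rho * delta
            + alpha / (1.0 - alpha) * (rho + delta) ** 2)


def bound_compact(alpha, delta, rho):
    """(d + a rho)^2 / (a (1 - a))."""
    return (delta + alpha * rho) ** 2 / (alpha * (1.0 - alpha))


def max_relative_difference(samples):
    """Largest relative gap between the two forms over the given samples."""
    return max(abs(bound_expanded(a, d, r) - bound_compact(a, d, r))
               / bound_compact(a, d, r) for a, d, r in samples)


samples = [(0.1, 0.1, 0.5), (0.01, 0.2, 2.0), (0.5, 0.001, 1.0), (0.3, 0.3, 0.0)]
gap = max_relative_difference(samples)
```

With $\alpha = \delta$ the compact form reduces to $\delta (1 + \rho)^2 / (1 - \delta)$, which is of order $\delta$ and yields the rate $\sqrt{\delta}$ after taking the square root.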



For $p = 1$ the source condition says that $u^+$ must only have a finite number
of non-zero entries. This is the natural limit for $p \to 1$ as can be seen from
Remark 3.5.

Theorem 4.3 is remarkable since, as mentioned in Remark 3.3, the constant
in the O-notation in Theorem 3.1 blows up to infinity for $p \to 1$. Equation (13)
shows that the constant in the O-notation depends on the constant $\lambda$ from
Lemma 4.2 and on $\rho = \|\theta\|_Y/2$ only. Basically the constant $1/\kappa$ in Remark 3.3
has been replaced by $1/\lambda$ from Lemma 4.2.

Remark 4.4 (The result of Hofmann et al. [15]). Hofmann et al. considered
in [15] general convex regularization of operator equations in Banach spaces of
the form

$$\|F(u) - g^\delta\|_Y^p + \alpha J(u).$$

They showed a convergence rate of $O(\delta)$ in the Bregman distance for non-smooth
operators $F$ under the source condition that there exist $\beta_1 \in [0, 1[$, $\beta_2 \ge 0$ and
$\xi \in \partial J(u^+)$ such that

$$-(\xi \,|\, u - u^+) \le \beta_1 D_\xi(u, u^+) + \beta_2 \|F(u) - F(u^+)\|$$

(note that the negative sign on the left hand side is a typo in the original paper).
This source condition is difficult to check in concrete situations. Applied to the
situation of Theorem 4.3 it reads as: There exists $\xi \in w\,\mathrm{Sgn}(u^+)$ such that

$$-(\xi \,|\, u - u^+) \le \beta_1 D_\xi(u, u^+) + \beta_2 \|K(u - u^+)\|_Y.$$

This condition is for example fulfilled if the sequence $w_k$ is bounded and

$$\|u - u^+\|_1 \le \frac{1}{\max_k w_k} \bigl(\beta_1 R(u) + \beta_2 \|K(u - u^+)\|_Y\bigr)$$

which resembles the Bregman-Taylor estimate from Lemma 4.2. However,
Theorem 4.3 gives a convergence rate of $O(\sqrt{\delta})$ in the $\ell^1$-norm and the Bregman-
Taylor estimate is only needed to pass from the Bregman distance to the $\ell^1$-norm.
Additionally, Theorem 4.3 needs the source condition (12).



5 Regularization with p < 1?



The functional (1) is not convex if $p < 1$. Hence, there is no guarantee for
uniqueness or existence of a minimizer. In this section we show two extreme
examples: one in which minimizers exist, can be computed explicitly,
and regularization can be proven, and another where no minimizer exists at
all.



5.1 Regularization is possible 

In this example we use an orthonormal basis which is perfectly adapted to the
operator: the singular basis. The singular value decomposition $(\sigma_k, \psi_k, \phi_k)$ of
the operator $A$ consists of the singular values $\sigma_k$ and two orthonormal bases $\psi_k$
and $\phi_k$ of $X$ resp. $Y$. The operator $A$ can now be expressed as

$$Af = \sum_k \sigma_k (f|\psi_k)\, \phi_k.$$

Now we seek a solution of $Af = g$ which is sparse in the basis $\psi_k$, i.e. we
have $u_k = (f|\psi_k)$ in (1). Hence, the operator $K = AB$ has the form

$$Ku = ABu = \sum_k u_k \sigma_k \phi_k. \tag{14}$$

To express the minimizer of (1) we need the following function:

$$H^p_\alpha(x) = \operatorname{argmin}_y\, (y - x)^2 + \alpha |y|^p. \tag{15}$$

Note that this function can be multivalued in general. The next lemma from [16]
gives an implicit representation of the function $H^p_\alpha$.

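Lemma 5.1. Let

$$G^p_\alpha(y) = y + \frac{\alpha p}{2}\,\mathrm{sgn}(y)|y|^{p-1}. \tag{16}$$

The mapping $H^p_\alpha$ is given by the following formulae:

1. Let $1 < p \le 2$. Then $(G^p_\alpha)^{-1}$ exists and is single valued and it holds

$$H^p_\alpha(x) = (G^p_\alpha)^{-1}(x).$$

2. Let $p = 1$. Then

$$H^1_\alpha(x) = \max(|x| - \alpha/2, 0)\,\mathrm{sgn}(x).$$

3. Let $0 < p < 1$. Then

$$H^p_\alpha(x) = \begin{cases} 0, & \text{for } |x| \le \alpha_{\mathrm{eff}} \\ \text{the value of largest absolute value of the inverse mapping of } G^p_\alpha, & \text{for } |x| \ge \alpha_{\mathrm{eff}} \end{cases} \tag{17}$$

where $\alpha_{\mathrm{eff}} = \frac{2-p}{2-2p} \bigl(\alpha(1 - p)\bigr)^{\frac{1}{2-p}}$.

4. Let $p = 0$. Then

$$H^0_\alpha(x) = \begin{cases} 0, & \text{for } |x| \le \alpha_{\mathrm{eff}} \\ x, & \text{for } |x| \ge \alpha_{\mathrm{eff}} \end{cases} \tag{18}$$

where $\alpha_{\mathrm{eff}} = \sqrt{\alpha}$.

Figure 1: The thresholding functions $H^p_\alpha$ for p = 0, 0.15, 0.3, 0.45, 0.6, 0.75, 0.9, 1
and $\alpha = 3$.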



The description in 3. may be a little unfamiliar. For $p < 1$ the function $G^p_\alpha$
is multivalued in 0 with $G^p_\alpha(0) = \mathbb{R}$. Its inverse is again multivalued (in fact
it has at most three values) and the function $H^p_\alpha$ chooses either 0 or the value
of largest absolute value, see Figure 1 and [16] for more details. Note moreover
that for $p < 1$ the function $H^p_\alpha$ is multivalued itself, namely it has two values
for $|x| = \alpha_{\mathrm{eff}}$. For convenience we always choose the value 0 at these points in
the following.
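The thresholding functions of Lemma 5.1 can be evaluated numerically. The sketch below is one possible way to do so: the cases $p = 0$ and $p = 1$ use the closed forms, and for the remaining exponents the relevant root of $G^p_\alpha(y) = |x|$ is located by bisection. The bisection and the names `H`, `G` and `alpha_eff` are choices made here for illustration and are not taken from [16]; ties are resolved to the value 0 as in the text.

```python
import math


def alpha_eff(alpha, p):
    """Effective threshold for 0 <= p < 1; equals sqrt(alpha) for p = 0."""
    return ((2.0 - p) / (2.0 - 2.0 * p)
            * (alpha * (1.0 - p)) ** (1.0 / (2.0 - p)))


def G(y, alpha, p):
    """G^p_alpha(y) for y >= 0 and p > 0 (value 0 is used at y = 0)."""
    return y + 0.5 * alpha * p * y ** (p - 1.0) if y > 0 else 0.0


def H(x, alpha, p, n_iter=200):
    """Thresholding function H^p_alpha(x) for 0 <= p <= 2."""
    a = abs(x)
    if p == 0:
        return x if a > math.sqrt(alpha) else 0.0
    if p == 1:
        return math.copysign(max(a - alpha / 2.0, 0.0), x)
    if p < 1:
        if a <= alpha_eff(alpha, p):
            return 0.0
        # G is increasing to the right of its minimizer, where the
        # root of largest absolute value is located.
        lo = (0.5 * alpha * p * (1.0 - p)) ** (1.0 / (2.0 - p))
    else:
        lo = 0.0
    hi = a
    for _ in range(n_iter):  # bisection for G(y) = |x| on [lo, hi]
        mid = 0.5 * (lo + hi)
        if G(mid, alpha, p) < a:
            lo = mid
        else:
            hi = mid
    return math.copysign(0.5 * (lo + hi), x)
```

For example, `H(3.0, 2.0, 1)` applies soft thresholding with threshold $\alpha/2 = 1$, and `H(1.0, 4.0, 0)` is set to zero because $|x| \le \sqrt{\alpha} = 2$.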

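The next lemma is an easy consequence of the above lemma and the fact
that the operator $K$ is diagonal with respect to the basis $(\phi_k)$ of $Y$.

Lemma 5.2. Let $(\phi_k)$ be an orthonormal basis of $Y$ and let the operator
$K : \ell^2 \to Y$ be given by (14). Then, a minimizer of (1) is given by

$$u_k = \begin{cases} \sigma_k^{-1} H^p_{\alpha w_k \sigma_k^{-p}}\bigl((g^\delta|\phi_k)\bigr), & \text{for } \sigma_k > 0 \\ 0, & \text{for } \sigma_k = 0. \end{cases} \tag{19}$$

Definition 5.3. For $0 \le p \le 2$ we define the operator $R^p_\alpha : Y \to \ell^2$ by

$$(R^p_\alpha g)_k = \begin{cases} \sigma_k^{-1} H^p_{\alpha w_k \sigma_k^{-p}}\bigl((g|\phi_k)\bigr), & \text{for } \sigma_k > 0 \\ 0, & \text{for } \sigma_k = 0. \end{cases}$$

Note that $R^p_\alpha$ is non-linear and discontinuous.

Theorem 5.4. Let $0 \le p \le 1$. The operator $R^p_\alpha$ is

1. defined for every $g \in Y$,

2. a regularization, i.e. for $g \in \operatorname{dom}(K^+)$ it holds

$$\lim_{\alpha \to 0} \|R^p_\alpha g - K^+ g\| = 0.$$

Proof. We abbreviate $g_k = (g|\phi_k)$ and $\alpha_k = \alpha w_k \sigma_k^{-p}$. The pseudo-inverse is given by

$$(K^+ g)_k = \begin{cases} g_k/\sigma_k, & \text{for } \sigma_k > 0 \\ 0, & \text{for } \sigma_k = 0 \end{cases}$$

and by the Picard condition this is an $\ell^2$ sequence. For an $M \in \mathbb{N}$ we write

$$\|R^p_\alpha(g) - K^+ g\|^2 = \sum_{\sigma_k > 0} \frac{|H^p_{\alpha_k}(g_k) - g_k|^2}{\sigma_k^2} = \sum_{\sigma_k > 0,\ k \le M} \frac{|H^p_{\alpha_k}(g_k) - g_k|^2}{\sigma_k^2} + \sum_{\sigma_k > 0,\ k > M} \frac{|H^p_{\alpha_k}(g_k) - g_k|^2}{\sigma_k^2}.$$

For a given $\epsilon > 0$ we choose $M$ such that $\sum_{\sigma_k > 0,\ k > M} |g_k|^2/\sigma_k^2 \le \epsilon$. Since we
can deduce from Lemma 5.1

$$|H^p_\alpha(x) - x| \le |x|$$

we can estimate

$$\|R^p_\alpha(g) - K^+ g\|^2 \le \sum_{\sigma_k > 0,\ k \le M} \frac{|H^p_{\alpha_k}(g_k) - g_k|^2}{\sigma_k^2} + \epsilon.$$

Furthermore, we see from Lemma 5.1

$$H^p_\alpha(x) \to x \quad \text{for } \alpha \to 0$$

and hence, for sufficiently small $\alpha$ we have

$$\|R^p_\alpha(g) - K^+ g\|^2 \le 2\epsilon.$$

□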
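The convergence stated in Theorem 5.4 can be observed in a small experiment for $p = 0$. The sketch below is an illustration under assumptions made here: unit weights $w_k = 1$, a truncation to 50 coefficients, and hypothetical singular values $\sigma_k = 1/k$ with data $g_k = 1/k^3$. For $p = 0$ each coefficient minimizes $(\sigma_k u_k - g_k)^2 + \alpha\,\mathrm{sgn}(|u_k|)$, which keeps $g_k/\sigma_k$ whenever $g_k^2 > \alpha$ and returns 0 otherwise.

```python
import math


def pseudo_inverse(g, sigma):
    """Coefficients of K^+ g for a diagonal operator with entries sigma_k."""
    return [gk / sk if sk > 0 else 0.0 for gk, sk in zip(g, sigma)]


def regularize_p0(g, sigma, alpha):
    """Coefficientwise minimizer for p = 0 and unit weights (assumption)."""
    return [gk / sk if sk > 0 and gk * gk > alpha else 0.0
            for gk, sk in zip(g, sigma)]


def error(g, sigma, alpha):
    """l2 distance between the regularized solution and K^+ g."""
    exact = pseudo_inverse(g, sigma)
    approx = regularize_p0(g, sigma, alpha)
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(approx, exact)))


# Hypothetical test data: sigma_k = 1/k and g_k = 1/k^3, so (K^+ g)_k = 1/k^2.
sigma = [1.0 / k for k in range(1, 51)]
g = [1.0 / k ** 3 for k in range(1, 51)]
errors = [error(g, sigma, alpha) for alpha in (1e-2, 1e-6, 1e-10, 1e-20)]
```

The list `errors` is non-increasing and reaches zero once $\alpha$ is below the smallest squared data coefficient of the truncated problem.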

The above theorem only proves convergence on the range of the operator.
To obtain results on the speed of convergence one may assume special sparseness
or decay properties similar to [3]. We are not going to pursue this
direction further since the case of the singular basis is of limited interest in
practical applications. Moreover, convergence for noisy data has not been shown.



5.2 Regularization is impossible 

In this section we present an example where a sparsity constraint with exponent
$p = 0$ does not lead to a regularization. In particular the minimization of
the Tikhonov functional is not well-posed in the sense that it does not have a
solution. To this end, we design an operator $A$ which does not act well on a
given orthonormal basis $(\psi_k)$. Let $\{h_k\}$ be a countable set which is dense in the
unit sphere of $Y$, i.e. $\|h_k\|_Y = 1$ and for every $g \in Y$ with $\|g\|_Y = 1$ and every
$\epsilon > 0$ there is an index $k_0$ such that $\|g - h_{k_0}\|_Y \le \epsilon$. We define the operator $A$
on the basis $(\psi_k)$ by

$$A\psi_k = h_k, \quad \text{i.e.} \quad Ku = \sum_k u_k h_k. \tag{20}$$

Proposition 5.5. Let $K$ be defined by (20), $\|g\|_Y^2 > \alpha$ and let further $g$ be not
a multiple of $h_k$ for every $k$. Then the functional

$$\Psi(u) = \|Ku - g\|_Y^2 + \alpha \sum_k \mathrm{sgn}(|u_k|)$$

does not have a minimizer.

Proof. Since the penalty term $\sum_k \mathrm{sgn}(|u_k|)$ only depends on the number of
non-zero coefficients we minimize separately over subspaces of a given dimension $n$.

As first case we consider $n = 0$, i.e. we minimize just over $u = 0$. We observe
that $\Psi(0) = \|g\|_Y^2$.

As second case we observe that $\Psi(u) \ge 2\alpha$ if $u$ has at least two
non-zero entries.

The last case is to minimize over the one-dimensional subspaces $X_k =
\operatorname{span}\{e_k\}$ where $e_k$ is the canonical basis of $\ell^2$. The values of $\Psi$ are

$$\Psi(d_k e_k) = \|d_k h_k - g\|_Y^2 + \alpha.$$

Since $\{h_k\}$ is dense in the unit sphere we may take $d_k = \|g\|_Y$ and find a sequence
$h_l$ such that $\|g\|_Y h_l \to g$ for $l \to \infty$. Hence, the infimum of $\Psi$ over all
subspaces $X_k$ is $\alpha$, i.e.

$$\inf_{u \in \bigcup_k X_k} \Psi(u) = \alpha$$

and this infimum is not attained since $g$ is not a multiple of a basis vector $h_k$. □
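The mechanism of the proof can be illustrated in the plane. In the sketch below the vectors $h_k = (\cos(1/k), \sin(1/k))$ are a hypothetical choice made here: they are unit vectors which approach the direction of $g = (2, 0)$ without ever being parallel to it. The values $\Psi(\|g\| e_k)$ then decrease towards $\alpha$ but stay strictly above it.

```python
import math


def psi_one_term(d, h, g, alpha):
    """Psi(d e_k) = ||d h_k - g||^2 + alpha for one active coefficient."""
    return sum((d * hi - gi) ** 2 for hi, gi in zip(h, g)) + alpha


def one_term_values(g, alpha, ks):
    """Values Psi(||g|| e_k) for the hypothetical directions h_k."""
    norm_g = math.sqrt(sum(gi * gi for gi in g))
    values = []
    for k in ks:
        h = (math.cos(1.0 / k), math.sin(1.0 / k))  # unit vector, never parallel to g
        values.append(psi_one_term(norm_g, h, g, alpha))
    return values


alpha = 1.0
g = (2.0, 0.0)
values = one_term_values(g, alpha, (1, 10, 100, 1000))
```

Here $\Psi(0) = \|g\|^2 = 4 > \alpha$, so neither $u = 0$ nor any single coefficient attains the infimum $\alpha$.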

It is clear that a similar example can be constructed if the vectors $h_k$
accumulate at a single point: take $g$ as the accumulation point of $h_k$.

Remark 5.6. We remark that also the constrained model

$$\text{Minimize } \sum_k \mathrm{sgn}(|u_k|) \quad \text{s.t.} \quad \|Ku - g^\delta\|_Y \le \epsilon \tag{$P_\epsilon$}$$

is not well posed with $K$ from (20) since it has an infinite number of solutions.
One may say that this situation is a little better than that of Proposition 5.5
since now solutions are available. An easy example shows that regularization
need not happen in this setup. Let $g^+ = h_1$ and let $\|g^+ - g^\delta\|_Y \le \delta$. The
corresponding true solution is $u^+ = e_1$. Then there is a sequence $h_l$ such that
$h_l \to h_1 = g^+$. Moreover, for sufficiently large $l$, $u^{\epsilon,\delta} = e_l$ is a solution of ($P_\epsilon$)
with $\epsilon = \tau\delta$ with $\tau > 1$ (assuming that the norm of $g^\delta$ is not too small). Finally,
$\|u^{\epsilon,\delta} - u^+\| = \sqrt{2}$ is not converging to zero for $\epsilon = \tau\delta$ and $\delta \to 0$.



6 Conclusions



In this paper the regularizing properties of sparsity constraints have been
analyzed. Special attention was paid to convergence rates in norm and to the
source conditions. For $1 < p \le 2$ we could show, as a simple application of
the results of Burger and Osher [7] and the inequality of Xu and Roach [23] (or
the basic inequality in Lemma 2.3 from [4]), that a convergence rate $\sqrt{\delta}$ in the
2-norm can be achieved by a source condition saying that $u^+$ has to be in a
weighted $\ell^p$ space with small $p$, see Remark 3.5.

The case $p = 1$ needed a special technique: the Bregman-Taylor distance
from [5]. Applying this, a convergence rate $\sqrt{\delta}$ in the stronger 1-norm could be
achieved under the source condition that $u^+$ is finitely supported.

The incipient discussion on regularization with $p < 1$ showed two things:
First, regularization may or may not be possible and second, the regularization
properties depend on the interplay of the operator $A$ and on the choice of the
basis functions $(\psi_k)$, a phenomenon which is not known for $p \ge 1$. One may
conjecture that if the operator $A$ acts well on the basis $(\psi_k)$ (in the sense that the
values $\frac{|(A\psi_k|A\psi_l)|}{\|A\psi_k\| \|A\psi_l\|}$ are not too large) regularization is possible. This would
parallel observations in the framework of compressed sensing on the mutual
coherence of dictionaries, see [11].